A Hebrew Tree Bank Based on Cantillation Marks

Authors

  • Andi Wu
  • Kirk Lowery
Abstract

In the Masoretic text of the Hebrew Bible (HB), the cantillation marks function like a punctuation system that shows the division and subdivision of each verse, forming a tree structure similar to the prosodic tree in modern linguistics. However, in the Masoretic text this structure is hidden in a complicated set of diacritic symbols, and the rich information is accessible only to a few trained scholars. To make the structural information available to the general public and to automatic processing by computer, we built a tree bank in which the hierarchical structure of each HB verse is explicitly represented in XML format. We coded the punctuation system in a context-free grammar (CFG), which was then used by a CYK parser to automatically generate trees for the whole HB. The results show that (1) the CFG correctly encoded the annotation rules and (2) the annotation done by the Masoretes is highly...
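As a rough illustration of the pipeline the abstract describes (accent marks, then a CFG, then CYK parsing, then XML trees), here is a minimal sketch. Everything in it is an assumption made for illustration: the three accent tags, the toy grammar in `LEXICAL` and `BINARY`, the node labels `V`, `A`, `S`, `C`, and the function name `cyk_parse` are hypothetical and are not the authors' grammar, label set, or XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy grammar in Chomsky normal form (NOT the paper's grammar).
# Lexical rules map a word's accent tag to a nonterminal.
LEXICAL = {"conj": "C", "atnach": "A", "silluq": "S"}
# Binary rules: (left child, right child) -> parent.
# A conjunctive word attaches to the disjunctive unit that follows it,
# and a verse (V) is an atnach half followed by a silluq half.
BINARY = {("C", "A"): "A", ("C", "S"): "S", ("A", "S"): "V"}


def cyk_parse(tagged_words, start="V"):
    """Return an XML element for the first parse found, or None."""
    n = len(tagged_words)
    # chart[i][j] maps a nonterminal to the element covering words i..j-1.
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

    # Fill in the single-word spans from the lexical rules.
    for i, (word, tag) in enumerate(tagged_words):
        leaf = ET.Element(LEXICAL[tag], {"accent": tag})
        leaf.text = word
        chart[i][i + 1][LEXICAL[tag]] = leaf

    # Combine adjacent spans bottom-up, shortest spans first.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left_label, left in chart[i][k].items():
                    for right_label, right in chart[k][j].items():
                        parent = BINARY.get((left_label, right_label))
                        if parent and parent not in chart[i][j]:
                            node = ET.Element(parent)
                            node.extend([left, right])
                            chart[i][j][parent] = node

    return chart[0][n].get(start)


if __name__ == "__main__":
    # Placeholder words w1..w4 stand in for the words of a verse.
    verse = [("w1", "conj"), ("w2", "atnach"), ("w3", "conj"), ("w4", "silluq")]
    tree = cyk_parse(verse)
    print(ET.tostring(tree, encoding="unicode"))
```

The sketch keeps only the first derivation found for each span, which is enough for an unambiguous toy grammar; a grammar covering the full accent system would need many more rules and a policy for ambiguous spans.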


Similar resources

From Prosodic Trees to Syntactic Trees

This paper describes an ongoing effort to parse the Hebrew Bible. The parser consults the bracketing information extracted from the cantillation marks of the Masoretic text. We first constructed a cantillation treebank which encodes the prosodic structures of the text. It was found that many of the prosodic boundaries in the cantillation trees correspond, directly or indirectly, to the phrase bo...


A Simple Technique for Typesetting Hebrew with Vowel Points

This paper describes a simple mechanism for typesetting Hebrew with vowel points. Hebrew uses a large set of accents that represent vowels, consonant modifiers, and cantillation instructions. These accents are placed above, below, or inside letters; a single letter can carry several accents. The solution that we describe, which is designed for PostScript [2] output devices, leaves the placement...


Building a Tree-Bank of Modern Hebrew Text

This paper describes the process of building the first tree-bank for Modern Hebrew texts. A major concern in this process is the need for reducing the cost of manual annotation by the use of automatic means. To this end, the joint utility of an automatic morphological analyzer, a probabilistic parser and a small manually annotated tree-bank was explored. An initial tree-bank that consists of 50...



Vowel reduction in Modern Hebrew: Traces of the past and current variation

The aim of this paper was to find out the scope and boundaries of a-reduction in Modern Hebrew. In Classical Hebrew, vowel reduction was a regular, obligatory process. In Modern Hebrew, it has restricted scope and operates under opaque conditions. The only reliable trace of the historical motivation for the rule is the Hebrew vocalization system (nikud). 100 participants in four age groups were...




Publication date: 2006